In this article, a novel technique based on the empirical mode decomposition methodology for processing speech features is proposed and investigated. Empirical mode decomposition generalizes Fourier analysis: it decomposes a signal into a sum of intrinsic mode functions. In this study, we implement an iterative algorithm to find the intrinsic mode functions of any given signal. We design a novel speech feature post-processing method based on the extracted intrinsic mode functions to achieve noise-robustness for automatic speech recognition. Evaluation results on the noisy-digit Aurora 2.0 database show that our method leads to significant performance improvement. The relative improvement over the baseline features increases from 24.0% to 41.1% when the proposed post-processing method is applied to mean-variance normalized speech features. The proposed method also improves on the performance achieved by a very noise-robust frontend when the test speech data are highly mismatched.
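To make the decomposition concrete, the following is a minimal sketch of an iterative sifting procedure that splits a signal into intrinsic mode functions plus a residue, so that the signal equals their sum. It is an illustration only, not the implementation evaluated in this article. It rests on several assumptions: the envelopes are piecewise linear (empirical mode decomposition is usually described with cubic-spline envelopes), each intrinsic mode function is obtained with a fixed number of sifting passes rather than a convergence criterion, and the names `local_extrema`, `envelope`, `sift`, and `emd` are hypothetical.

```python
import math


def local_extrema(x):
    """Return indices of interior local maxima and local minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through the points (i, x[i]) for i in idx.

    Assumption: linear interpolation, held constant before the first and
    after the last extremum (a simplification of spline envelopes).
    """
    n = len(x)
    env = [0.0] * n
    for i in range(idx[0]):
        env[i] = x[idx[0]]
    for a, b in zip(idx, idx[1:]):
        for i in range(a, b):
            t = (i - a) / (b - a)
            env[i] = (1.0 - t) * x[a] + t * x[b]
    for i in range(idx[-1], n):
        env[i] = x[idx[-1]]
    return env


def sift(x, max_sift=10):
    """Repeatedly subtract the mean of the upper and lower envelopes."""
    h = list(x)
    for _ in range(max_sift):  # assumption: fixed pass count, no stop test
        maxima, minima = local_extrema(h)
        if not maxima or not minima:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h


def emd(x, max_imfs=6):
    """Decompose x into a list of mode functions and a final residue."""
    residue = list(x)
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if not maxima or not minima:  # nothing left to sift
            break
        imf = sift(residue)
        imfs.append(imf)
        residue = [r - c for r, c in zip(residue, imf)]
    return imfs, residue


# Usage: a fast oscillation riding on a slow one.
signal = [math.sin(2 * math.pi * n / 8) + 0.5 * math.sin(2 * math.pi * n / 64)
          for n in range(256)]
imfs, residue = emd(signal)
```

Because each extracted component is subtracted from the running residue, the components and the final residue sum back to the input by construction, whatever envelope interpolation is used.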